Frequent Pattern Mining in Attributed Trees

Authors

  • Claude Pasquier
  • Jérémy Sanhes
  • Frédéric Flouvat
  • Nazha Selmaoui-Folcher
Abstract

Frequent pattern mining is an important data mining task with a broad range of applications. Initially focused on the discovery of frequent itemsets, studies were later extended to structured data such as sequences, trees, and graphs. In this paper, we introduce a data mining method that discovers a new kind of pattern in a collection of attributed trees (atrees). Attributed trees are trees in which each vertex is associated with an itemset. Mining these patterns (called asubtrees) combines tree mining and itemset mining and requires exploring a huge search space. We present several new algorithms for attributed tree mining and show that their implementations can efficiently enumerate the frequent patterns in a database of several thousand attributed trees.
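To make the notion of an attributed tree concrete, the following Python sketch represents each vertex as an itemset with ordered children and counts the support of a candidate pattern by brute force. It is only an illustration and not the authors' algorithms: the names `ANode`, `matches_at`, `occurs_in`, and `support` are hypothetical, and the inclusion rule (each pattern vertex's itemset must be a subset of the itemset of the vertex it maps to, with parent-child links and sibling order preserved) is an assumption made for this example.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class ANode:
    """A vertex of an attributed tree: an itemset plus ordered children."""
    items: FrozenSet[str]
    children: List["ANode"] = field(default_factory=list)


def matches_at(pattern: ANode, node: ANode) -> bool:
    """True if `pattern` embeds with its root mapped onto `node`.

    Assumed rule: the pattern itemset is a subset of the node itemset, and
    the pattern's children map, in order, onto distinct children of `node`.
    """
    if not pattern.items <= node.items:
        return False
    # Greedy ordered matching: each pattern child takes the earliest
    # remaining child of `node` that it can be embedded at.
    remaining = iter(node.children)
    return all(
        any(matches_at(p_child, n_child) for n_child in remaining)
        for p_child in pattern.children
    )


def occurs_in(pattern: ANode, tree: ANode) -> bool:
    """True if `pattern` matches at some vertex of `tree`."""
    stack = [tree]
    while stack:
        node = stack.pop()
        if matches_at(pattern, node):
            return True
        stack.extend(node.children)
    return False


def support(pattern: ANode, database: List[ANode]) -> int:
    """Number of trees in `database` that contain `pattern` at least once."""
    return sum(occurs_in(pattern, tree) for tree in database)


# A toy database of three attributed trees (made-up data).
database = [
    ANode(frozenset("ab"), [ANode(frozenset("c")), ANode(frozenset("ad"))]),
    ANode(frozenset("a"), [ANode(frozenset("ade"))]),
    ANode(frozenset("b"), [ANode(frozenset("c"))]),
]

# Pattern: a vertex containing item "a" with a child containing item "d".
pattern = ANode(frozenset("a"), [ANode(frozenset("d"))])
print(support(pattern, database))  # 2: the first two trees contain it
```

A pattern is frequent when its support reaches a user-chosen threshold. Because every vertex can carry any subset of the items, the number of candidate patterns grows with both the tree shapes and the itemsets, which is why a naive enumeration like this one does not scale and dedicated algorithms are needed.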

Similar resources

Mining XML Frequent Query Patterns

With XML being the standard for data encoding and exchange over the Internet, finding interesting XML query characteristics efficiently has become a critical issue. Mining frequent query patterns is a technique for discovering the most frequently occurring query pattern trees in a large collection of XML queries. In this paper, we describe an efficient mining algorithm to discover the frequent que...

Canonical Forms for Labeled Trees and Their Applications in Frequent Subtree Mining

Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, computer networks, and so on. In this paper, we first present two canonical forms for labeled rooted unordered trees: the breadth-first canonical form (BFCF) and the depth-first canonical form (DFCF). Then the canonical forms are applied to the frequent subtree mining problem. Based...

Indexing and Mining Free Trees

Tree structures are used extensively in domains such as computational biology, pattern recognition, computer networks, and so on. In this paper, we present an indexing technique for free trees and apply this indexing technique to the problem of mining frequent subtrees. We first define a novel representation, the canonical form, for rooted trees and extend the definition to free trees. We also ...

Mining of Users’ Access Behaviour for Frequent Sequential Pattern from Web Logs

Sequential pattern mining applies data mining techniques to a sequential database in order to discover the correlations that exist among events in an ordered list. The task of discovering frequent sequences is challenging, because the algorithm needs to process a combinatorially explosive number of possible sequences. Discovering hidden information fro...

Mining Frequent Rooted Trees and Free Trees Using Canonical Forms

Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, computer networks, and so on. In this paper, we present HybridTreeMiner, a computationally efficient algorithm that discovers all frequently occurring subtrees in a database of rooted unordered trees. The algorithm mines frequent subtrees by traversing an enumeration tree that syst...

Publication date: 2013